Language Editing Dataset of Academic Texts
Abstract
We describe the VTeX Language Editing Dataset of Academic Texts (LEDAT), a dataset of text extracts from scientific papers that were edited by professional native English language editors at VTeX. The goal of LEDAT is to provide a large data resource for developing language evaluation and grammatical error correction systems for the scientific community. We describe how the data were collected and how LEDAT was compiled. The new dataset can be used in NLP studies and applications that require deeper knowledge of academic language and language editing. It can also serve as a knowledge base of academic English to support writers of scientific papers.
Similar resources
Writers on the Move: Visualizing Composing Processes Involved in Academic Writing
The present research study aimed to explore covert processes of editing and revision which were involved in writing four different academic text genres (i.e. abstract, conclusion, data commentary, and cover letter) in the English language. To this end, six EFL learners with Persian as their mother tongue were recruited to participate in this study. All the participants attended an induction session and ea...
From Academic to Journalistic Texts: A Qualitative Analysis of the Evaluative Language of Science
This study examined academic articles and journalistic reports in 5 disciplinary areas to explore how similar contents might attitudinally be realized in two different genres. To this end, 25 research articles and 210 news reports were carefully selected and underwent detailed discourse semantic and grammatical analyses with the purpose of identifying the evaluative linguistic patterns....
Design and implementation of Persian spelling detection and correction system based on Semantic
The Persian language has special features (grapheme, homophone, and multi-shape clinging characters) in electronic devices. Furthermore, the design and implementation of NLP tools for Persian are more challenging than for other languages (e.g. English or German). Spelling tools are widely used for editing user texts such as emails and text in editors. Also developing Persian tools will provide Persian progr...
Teaching Academic Vocabulary Through Reconstruction Editing Task: Does Group Size Matter?
The use of collaborative classroom interactional tasks is on the rise recently since they incorporate the negotiation of meaning and thus they may be regarded as one of the most efficient ways to ease a learner’s focus on form. This study investigated the immediate and long-term effects of reconstruction editing task on the learning of 20 academic vocabulary items through using five reconstruct...
Persian Speakers’ Recognition of English Relative Clauses: The Effects of Enhanced Input vs. Explicit Feedback Types
Despite consensus in focus on form (FOF) instruction over the facilitative role of noticing, controversy has not quelled over ways of directing EFL learners’ attention towards formal features via implicit techniques like input-enhancement or explicit metacognitive feedback and interactive peer-editing on the output they produce. This quasi-experimental study investigated the impact of input enh...
Publication date: 2014